SINOD - Slovenian non-native speech database

Authors

  • Andrej Zgank
  • Darinka Verdonik
  • Alexandra Zögling Markus
  • Zdravko Kacic
Abstract

This paper presents SINOD, the first Slovenian non-native speech database. It will be used to improve the performance of a large vocabulary continuous speech recogniser for non-native speakers. The main impact on quality is expected for the acoustic models and the recogniser's vocabulary. The SINOD database is designed as a supplement to the Slovenian BNSI Broadcast News database, and the same BN recommendations were used for both databases. Two interviews with non-native Slovenian speakers were incorporated in the set. Both non-native speakers were female, whereas the journalist was a male native speaker of Slovenian. The transcription approach applied in the production phase is presented, along with various statistics and analyses of the database.

Similar resources

Acquisition and Annotation of Slovenian Lombard Speech Database

This paper presents the acquisition and annotation of Slovenian Lombard Speech Database, the recording of which started in the year 2008. The database was recorded at the University of Maribor, Slovenia. The goal of this paper is to describe the hardware platform used for the acquisition of speech material, recording scenarios and tools used for the annotation of Slovenian Lombard Speech Databa...

Objective analysis of emotional speech for English and Slovenian Interface emotional speech databases

In this paper we propose a new approach to the analysis of emotional speech prosody features. The aim of the analysis is to define the emotional features that characterise emotions. The analysis was performed on emotional speech databases that were recorded in the framework of the project "Multimodal Analysis/Synthesis System for Human Interaction to Virtual and Augmented Environments" (Interface). Th...

Labeling of Prosodic Events in Slovenian Speech Database GOPOLIS

The paper describes the prosodic annotation procedures of the GOPOLIS Slovenian speech database and methods for automatic classification of different prosodic events. Several statistical parameters concerning the duration and loudness of words, syllables and allophones were computed for the Slovenian language, for the first time on such a large amount of speech data. The evaluation of the annotate...

BNSI Slovenian broadcast news database - speech and text corpus

This paper presents the BNSI Slovenian Broadcast News database project. The result of the project is a database with a speech and text corpus oriented toward large vocabulary continuous speech recognition in the general domain. The speech corpus consists of 36 hours of transcribed evening and late night news. The raw database material was captured in the archive of national broadcaster RTV Slovenia t...

Speech Recognition of Slovenian and Croatian Weather Forecasts

In this paper we present some results of a joint project on speech data collection and speech recognition of Slovenian and Croatian weather forecasts. We describe the procedures we performed in order to obtain a domain-specific speech database from broadcast programmes. Additionally, the speech recognition experiments are described and some speech recognition results for the Cro...

Publication date: 2006